We study the dynamics of protein folding via statistical energy-landscape theory. In particular,
we concentrate on the local-connectivity case with the folding progress described by the fraction of 
native conformations. We obtain information for the first passage-time (FPT) distribution and its 
moments. The results show a dynamic transition temperature below which the FPT distribution 
develops a power-law tail, a signature of the intermittency phenomena of the folding dynamics. We 
also discuss the possible application of the results to single-molecule dynamics experiments. 

The study of diffusion along a statistical energy landscape
is a central problem in many fields. In
the field of protein folding, the crucial question is how
the many possible configurations of a polypeptide chain
dynamically converge to a particular folded state [1].
Clearly, a statistical description is needed for a large 
number of configurational states. According to the 
energy-landscape theory of protein folding j^-Q], there 
exists a global bias of the energy landscape towards the
folded state due to natural selection. Superimposed
on this bias is the fluctuation, or roughness, of the
energy landscape coming from competing interactions of 
the amino acid residues. The folding energy landscape is 
like a funnel, and there are in general multiple routes to- 
wards the folded state. Discrete paths emerge only when 
the underlying energy landscape becomes rough, and local
traps (minima) start to appear. In addition to the ther- 
modynamics of the folding-energy landscape, the kinetics 
of folding along the order parameter that represents the 
progress of folding towards the native state can be dis- 
cussed. 

When the energy landscape is smooth, the average dif- 
fusion time is a good parameter for the characterization 
of the dynamical process. On the other hand, when the 
energy landscape is rough, there exist large fluctuations 
of the energies, and the diffusion time is expected to
fluctuate strongly about its mean. In this case we need
the full distribution of the diffusion time to
characterize the folding process.

It is now possible to measure the reaction and fold- 
ing dynamics of individual molecules in the laboratory 
H||. On complex energy landscapes such as those of 
biomolecules, reactions in general do not obey exponen- 
tial laws, and activation processes often do not follow 
the simple Arrhenius relation. However, measurements 
on large populations of molecules usually cannot
distinguish whether such complex rate dynamics arises from
an inhomogeneous distribution among molecules or from
intrinsic features of individual molecules. The study of
the statistics of individual molecular reaction events can
greatly clarify these more subtle reaction processes M. 
The information on the diffusion-time distribution pro- 
vides a way to help unravel the fundamental mechanism 
of single molecule reactions. Previously many works have 
focused on the average rate behavior |^,^|, whereas very 
few studies have addressed the
full distribution of folding rates. In this paper, we
concentrate on the statistics and distributions of the first 
passage time (FPT) to probe the folding energy land- 
scape. A dynamic transition temperature is found above 
which the FPT distribution is Poisson-like, and below 
which the distribution develops a power-law tail, where 
non-self-averaging behavior in kinetics emerges. More- 
over, we find that this dynamic transition temperature is 
close to the thermodynamic folding transition tempera- 
ture. 

The framework we adopt here was first introduced by
Bryngelson and Wolynes. The problem of folding
dynamics is modelled as a random walk on a rough
energy landscape. In this model, the energy landscape is
generated by the random-energy model [10], which
assumes that interactions among non-native states are
random variables with given probability distributions. In
this model there are $N$ residues in a polypeptide chain.
For each residue there are $\nu + 1$ available conformational
states, one being the native state. A simplified version
of the polypeptide chain energy is expressed as

$$E = -\sum_i \epsilon_i(\alpha_i) - \sum_i J_{i,i+1}(\alpha_i, \alpha_{i+1}) - \sum_{i,j} K_{i,j}(\alpha_i, \alpha_j) \tag{1}$$

where the summation indices $i$ and $j$ label the amino
acid residues, and $\alpha_i$ is the state of the $i$th residue. The three
terms represent the one-body potential, the two-body in- 
teractions for nearest-neighbor residues in sequence, and 
the interactions for residues close in space but not in se- 
quence, respectively. In this random energy model the 
energies for the non-native states and interactions are
replaced by random variables with Gaussian distributions.
With this random-energy construction one can easily
generate energy surfaces whose roughness is controlled by
the widths of these probability distributions. Another
feature of the model is the random-energy approximation,
which assumes that the energies of different
configurations are uncorrelated. Using a microcanonical
ensemble analysis, the average free energy and thermodynamic
properties of the polypeptide chain can be obtained [10].
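
As a rough illustration of the random-energy construction described above, the following sketch draws uncorrelated Gaussian energies for a set of configurations with a fixed number of native residues. The function name `sample_rem_energies`, the per-residue parameters `gap` and `spread`, and the way the mean and width scale with the number of residues are illustrative assumptions; they are not the expressions used in this paper.

```python
import math
import random
import statistics


def sample_rem_energies(n_residues, n_native, gap, spread, n_configs, seed=0):
    """Draw uncorrelated Gaussian energies for n_configs configurations.

    Assumed toy model (not the paper's exact expression): each native
    residue contributes -gap, and each non-native residue contributes an
    independent Gaussian energy with zero mean and standard deviation
    `spread`.
    """
    rng = random.Random(seed)
    n_nonnative = n_residues - n_native
    mean = -gap * n_native
    # A sum of n independent Gaussians has standard deviation spread*sqrt(n).
    sigma = spread * math.sqrt(n_nonnative)
    return [rng.gauss(mean, sigma) for _ in range(n_configs)]


if __name__ == "__main__":
    # A larger spread gives a rougher landscape at the same mean energy.
    for spread in (0.5, 2.0):
        energies = sample_rem_energies(100, 50, gap=1.0, spread=spread,
                                       n_configs=5000)
        print(spread, statistics.mean(energies), statistics.stdev(energies))
```

In this toy version the ratio `gap / spread` plays the role of the bias-to-roughness ratio: increasing `spread` at fixed `gap` widens the energy distribution without moving its mean.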

In this study, we use the fraction of native
conformations $p$ as an order parameter that represents the progress
of folding. When the kinetics is dominated by
activated folding processes, the states are in general locally
connected in $p$. We therefore assume local connectivity,
which ensures that the dynamics proceeds
continuously in $p$. The kinetic process follows
Metropolis dynamics: the
transition rate from one conformational state to a neighboring
state is determined by the energy difference between the two
states, and an overall constant $R_0$ sets the time
scale of residue interactions. Readers are referred to
Refs. [2,11] for detailed derivations of the dynamic
equations. In brief, the energy landscape is first categorized
by the order parameter p, along which an energy distri- 
bution function is derived via Eq. (1). From this energy 
distribution function, along with the use of Metropolis 
dynamics, one can obtain expressions for the waiting- 
time distribution function and also the rate distribution 
P(R,p) for transitions between successive p's. Finally, 
using continuous-time random walks (CTRW) and as- 
suming the system to be in quasi-equilibrium, the follow- 
ing generalized Fokker-Planck equation in the Laplace- 
transformed space is obtained O: 



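$$s\,G(p,s) - n_i(p) = \frac{\partial}{\partial p}\left\{ D(p,s)\left[ G(p,s)\,\frac{\partial U(p,s)}{\partial p} + \frac{\partial G(p,s)}{\partial p} \right]\right\} \tag{2}$$

where $U(p,s) = F(p)/T + \log[D(p,s)/D(p,0)]$, and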



F(p)=N{-{Se- — )p-[SL 
(3)



Here $s$, which has units of inverse time, is the Laplace
transform variable conjugate to the time $\tau$; $G(p,s)$ is the Laplace
transform of the probability density function $G(p,\tau)$,
where $G(p,\tau)\,dp$ is the probability for a
polypeptide chain to lie between $p$ and $p + dp$ at time
$\tau$; and $n_i(p)$ is the initial condition for $G(p,\tau)$. $F(p)$ is
the average free energy of the polypeptide chain, $T$ is
the temperature, $\nu + 1$ is the number of conformational
states of each residue, and $\delta\epsilon$ and $\delta L$ are the energy
differences between the native and average non-native states
for one- and two-body interactions, respectively. $\Delta\epsilon$ and
$\Delta L$ are the energy spreads of one- and two-body non-native
interactions. Note that the two-body energies $\delta L$ and $\Delta L$
already include the type of interactions due to the sec- 
ond and third term in Eq. (1). D(p, s) is the frequency- 
dependent diffusion coefficient [||: 



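$$D(p,s) = \frac{\lambda(p)}{2N^2}\,\left\langle \frac{R}{R+s} \right\rangle_p \Big/ \left\langle \frac{1}{R+s} \right\rangle_p, \tag{4}$$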



where $\lambda(p) = 1/\nu + (1 - 1/\nu)p$. The average $\langle\;\rangle_p$ is
taken over $P(R,p)$, the probability distribution function
of the transition rate $R$ from a state with order parameter
$p$ to its neighboring states, which may have order
parameters equal to $p - 1/N$, $p$, or $p + 1/N$. The explicit expression
for $P(R,p)$ can be found in Ref. [2]. The boundary
conditions for Eq. (2) are reflecting at $p = 0$ and
absorbing at $p = p_f$. The choice of an
absorbing boundary at $p = p_f$ facilitates the
calculation of the first-passage-time distribution. One can also
rewrite Eq. (2) in its integral-equation representation by
integrating it twice over $p$:



$$G(p,s) = -\int_p^{p_f} dp' \int_0^{p'} dp''\,\left[ s\,G(p'',s) - n_i(p'') \right] \frac{\exp[U(p',s) - U(p,s)]}{D(p',s)} \tag{5}$$
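
To make the frequency-dependent diffusion coefficient more concrete, the sketch below evaluates it from a list of sampled transition rates, assuming Eq. (4) has the form $D(p,s) = [\lambda(p)/2N^2]\,\langle R/(R+s)\rangle_p / \langle 1/(R+s)\rangle_p$ with $\lambda(p) = 1/\nu + (1 - 1/\nu)p$. The Gaussian energy differences used to generate the rates are a hypothetical stand-in for the rate distribution $P(R,p)$ of Ref. [2], and the function names are illustrative only.

```python
import math
import random


def metropolis_rate(delta_e, temperature, r0=1.0e9):
    """Metropolis rate for a move that changes the energy by delta_e."""
    if delta_e <= 0.0:
        return r0
    return r0 * math.exp(-delta_e / temperature)


def diffusion_coefficient(rates, s, p, n_residues, nu):
    """Evaluate the assumed form of Eq. (4), with the averages over P(R, p)
    replaced by sample means over `rates`."""
    lam = 1.0 / nu + (1.0 - 1.0 / nu) * p
    mean_r_ratio = sum(r / (r + s) for r in rates) / len(rates)
    mean_inv = sum(1.0 / (r + s) for r in rates) / len(rates)
    return lam / (2.0 * n_residues ** 2) * mean_r_ratio / mean_inv


if __name__ == "__main__":
    rng = random.Random(0)
    temperature = 1.0
    # Hypothetical stand-in for P(R, p): Gaussian energy differences.
    rates = [metropolis_rate(rng.gauss(0.0, 3.0), temperature)
             for _ in range(10000)]
    for s in (0.0, 1.0e6, 1.0e9):
        print(s, diffusion_coefficient(rates, s, p=0.5, n_residues=100, nu=10))
```

At $s = 0$ the expression reduces to $\lambda(p)/2N^2$ divided by the mean waiting time $\langle 1/R \rangle_p$, so a few very slow rates (deep traps) dominate the long-time diffusion coefficient.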



The folding time is approximated by the first passage
time (FPT) for the order parameter to reach $p_f$. Thus
one can write down the following relation for the
FPT distribution function $P_{FPT}(\tau)$:

$$P_{FPT}(\tau) = -\frac{dS(\tau)}{d\tau}, \tag{6}$$

where $S(\tau) = \int_0^{p_f} dp\, G(p,\tau)$. The $n$th moment of the
FPT distribution function can be calculated via
$\langle\tau^n\rangle = n!\,(-1)^{n-1} \int_0^{p_f} dp\, G_{n-1}(p)$,
where $G(p,s) = G_0(p) + s\,G_1(p) + s^2 G_2(p) + \cdots$. By Taylor-expanding Eq. (5)
with respect to $s$, we can solve for $G_n(p)$ and therefore
$\langle\tau^n\rangle$ iteratively by matching the coefficients of $s^n$.
Alternatively, one can solve for $G(p,s)$ directly from
Eq. (5) using a linear-inversion technique. Observing
that $P_{FPT}(s) = 1 - s\,S(s)$, where $P_{FPT}(s)$ and $S(s)$ are the
Laplace transforms of $P_{FPT}(\tau)$ and $S(\tau)$, respectively,
one can calculate $P_{FPT}(s)$ and investigate $P_{FPT}(\tau)$ via
an inverse Laplace-transform technique.
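
The moment relation can be checked on a case with a known answer. Writing $S(s) = \sum_k s^k I_k$ with $I_k = \int_0^{p_f} dp\, G_k(p)$, and comparing $P_{FPT}(s) = 1 - s\,S(s)$ with the expansion $\langle e^{-s\tau}\rangle = \sum_n (-s)^n \langle\tau^n\rangle / n!$, gives $\langle\tau^n\rangle = n!\,(-1)^{n-1} I_{n-1}$. The sketch below applies this to single-exponential kinetics, for which $S(s) = 1/(s + k)$. The helper names and the exponential test case are illustrative assumptions, not part of the paper's calculation.

```python
import math


def fpt_moments(integrated_coeffs):
    """Return [<tau>, <tau^2>, ...] from I_k = integral of G_k(p) dp.

    Uses <tau^n> = n! (-1)^(n-1) I_(n-1), which follows from
    P_FPT(s) = 1 - s * S(s) and S(s) = sum_k s^k I_k.
    """
    return [
        math.factorial(n) * (-1) ** (n - 1) * integrated_coeffs[n - 1]
        for n in range(1, len(integrated_coeffs) + 1)
    ]


def exponential_coeffs(rate, order):
    """I_k for single-exponential kinetics, where S(s) = 1 / (s + rate)."""
    return [(-1) ** k / rate ** (k + 1) for k in range(order)]


if __name__ == "__main__":
    moments = fpt_moments(exponential_coeffs(rate=2.0, order=4))
    print(moments)                          # n! / rate**n for n = 1..4
    print(moments[1] / moments[0] ** 2)     # reduced second moment
```

For exponential kinetics the moments are $n!/k^n$, so the reduced second moment $\langle\tau^2\rangle/\langle\tau\rangle^2$ equals 2; values much larger than this signal a broad FPT distribution.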

We start the numerical calculations by setting $R_0 = 10^9\,\mathrm{s}^{-1}$,
$N = 100$ and $\nu = 10$, which approximate
the physical values. We confine ourselves to
single-domain proteins with sizes of less than 100 residues.
Larger proteins tend to form multiple domains,
which is beyond the scope of the current
mean-field approximation. For simplicity we
assume $\delta\epsilon = \delta L$ and $\Delta\epsilon = \Delta L$. The ratio of the
energy gap between the native state and the average of the
non-native states to the spread of the non-native states,
$\delta\epsilon/\Delta\epsilon$, then becomes an appropriate parameter, representing
the importance of the gap bias towards the folded state
relative to the roughness of the landscape. One can show
that only the relative ratios among $\delta\epsilon$, $\Delta\epsilon$ and $T$ are the
controlling parameters in this problem. We set the
initial distribution of the polypeptide chain molecules to be
$n_i(p) = \delta(p - p_i)$, where $p_i$ is set to 0.05. In our
calculations we set $p_f = 0.9$. This means that 90 percent of
the amino acid residues are in their native states.

The mean first passage time (MFPT) $\langle\tau\rangle$ for the
folding process versus a scaled inverse temperature, $T_0/T$,
is plotted in Fig. 1 for various settings of the parameter
$\delta\epsilon/\Delta\epsilon$. We obtain an inverted bell-like curve for each fixed
$\delta\epsilon/\Delta\epsilon$, and the MFPT reaches its minimum at a
temperature $T_0$. At high temperatures, the MFPT is large
although the diffusion process itself is fast (i.e., $D(p,s)$
is large). This slow folding is due to the
instability of the folded state. The MFPT is also large
at low temperatures, which indicates that the polypeptide
chain is trapped in low-energy non-native states. This is
in agreement with simulation studies 0. 

FIG. 1. MFPT versus reduced inverse temperature $T_0/T$
for various $\delta\epsilon/\Delta\epsilon$. As $\delta\epsilon/\Delta\epsilon$ increases, the minimum of
the MFPT decreases.
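
For intuition about how the landscape shapes the MFPT, the $s = 0$ limit of the integral equation, with a delta-function initial condition at $p_i$, a reflecting boundary at $p = 0$ and an absorbing boundary at $p_f$, reduces to the standard double-integral expression $\langle\tau\rangle = \int_{p_i}^{p_f} dp'\, e^{U(p')}/D(p') \int_0^{p'} dp''\, e^{-U(p'')}$ for one-dimensional diffusion. The sketch below evaluates it with the trapezoidal rule. The callables `u` and `d` are placeholders supplied by the user; the flat and single-barrier profiles in the demonstration are illustrative assumptions and are not the $F(p)/T$ and $D(p,0)$ of this paper.

```python
import math


def mean_first_passage_time(u, d, p_i, p_f, n_grid=2000):
    """MFPT from p_i to an absorbing boundary at p_f (reflecting at p = 0).

    Evaluates
        <tau> = int_{p_i}^{p_f} dp' exp(U(p')) / D(p')
                * int_0^{p'} dp'' exp(-U(p''))
    with the trapezoidal rule; u(p) and d(p) are callables.
    """
    h = p_f / n_grid
    grid = [k * h for k in range(n_grid + 1)]
    boltz = [math.exp(-u(p)) for p in grid]

    # inner[k] = integral of exp(-U) from 0 to grid[k]
    inner = [0.0]
    for k in range(1, n_grid + 1):
        inner.append(inner[-1] + 0.5 * h * (boltz[k - 1] + boltz[k]))

    # Outer integrand exp(U)/D times the inner integral, on the grid.
    outer = [inner[k] / (boltz[k] * d(grid[k])) for k in range(n_grid + 1)]

    total = 0.0
    for k in range(1, n_grid + 1):
        lo, hi = grid[k - 1], grid[k]
        if hi <= p_i:
            continue
        if lo < p_i:
            # Partial cell: interpolate the integrand linearly at p_i.
            w = (p_i - lo) / h
            f_lo = outer[k - 1] + w * (outer[k] - outer[k - 1])
            total += 0.5 * (hi - p_i) * (f_lo + outer[k])
        else:
            total += 0.5 * h * (outer[k - 1] + outer[k])
    return total


if __name__ == "__main__":
    flat = mean_first_passage_time(lambda p: 0.0, lambda p: 1.0, 0.05, 0.9)
    bump = mean_first_passage_time(
        lambda p: 5.0 * math.exp(-((p - 0.5) / 0.1) ** 2),
        lambda p: 1.0, 0.05, 0.9)
    print(flat, bump)
```

For a flat profile with constant $D$ the result is $(p_f^2 - p_i^2)/2D$, which gives a quick sanity check; adding a barrier between $p_i$ and $p_f$ increases the MFPT roughly exponentially in the barrier height.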

By comparing the MFPT minimum for various $\delta\epsilon/\Delta\epsilon$,
we conclude that this minimum becomes smaller when
the ratio of the energy gap to the roughness increases.
This suggests that a possible criterion for selecting the
subset of sequence space that leads to
well-designed, fast-folding proteins is to maximize $\delta\epsilon/\Delta\epsilon$. In
other words, one has to choose the sequence subspace
such that the global bias overwhelms the roughness of
the energy landscape [14].



We also calculate the higher-order moments of the FPT
distribution. In Fig. 2 we show the behavior of the
reduced second moment, $\langle\tau^2\rangle/\langle\tau\rangle^2$. We find that the
reduced second moment starts diverging at a temperature
around and below $T_0$, where the MFPT is at its
minimum. This is an indication of a long tail in the FPT
distribution. The divergence of the second moment also
shows that the dynamics exhibits non-self-averaging
behavior.

FIG. 2. $\langle\tau^2\rangle/\langle\tau\rangle^2$ versus reduced inverse temperature $T_0/T$
for various $\delta\epsilon/\Delta\epsilon$. At high temperature this value remains finite
and the folding process is self-averaging. As the temperature
drops, the value starts to diverge and non-self-averaging
behavior emerges.

From the study of higher moments, we find the
relationship $\langle\tau^n\rangle \approx n!\,\langle\tau\rangle^n$ when $T > T_0$. Therefore in
the high-temperature regime the FPT distribution
function is Poissonian and decays exponentially at large times.
When $T < T_0$, it is hard to obtain more information
from the moments because they diverge.
However, we can study the problem by solving Eq. (5)
directly. By investigating the behavior of the FPT
distribution function in the Laplace-transformed space, we see
that for $T < T_0$ the FPT distribution is very similar to a
Lévy distribution in time, which develops a
power-law tail at large times: $P_{FPT}(\tau) \sim \tau^{-(1+\alpha)}$ for large $\tau$. In
Fig. 3 we plot the exponent $\alpha$ versus $T_0/T$ for
the case $\delta\epsilon/\Delta\epsilon = 4.0$. We find that $\alpha$ decreases as
the temperature is lowered, and $\alpha$ approaches 1 when $T$
approaches $T_0$, where exponential kinetics is recovered.

FIG. 3. The exponent $\alpha$ versus $T_0/T$ for the case
$\delta\epsilon/\Delta\epsilon = 4.0$ when $T_0/T > 1$. Below the transition
temperature $T_0$, the FPT distribution is close to a Lévy distribution,
which has a power-law tail $P_{FPT}(\tau) \sim \tau^{-(1+\alpha)}$ at large $\tau$.
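
In a single-molecule setting one would have a finite list of measured folding times rather than the Laplace-space distribution. As a hedged illustration of how a tail exponent $\alpha$ could be read off such data, the sketch below draws synthetic times with density proportional to $\tau^{-(1+\alpha)}$ (a Pareto law, used here as a stand-in for the Lévy-like tail) and recovers $\alpha$ with the Hill estimator. The estimator is swapped in for illustration and is not the method used in this paper, which works with the Laplace-transformed distribution; the function names and the choice of `k` are assumptions.

```python
import math
import random


def sample_power_law(alpha, n, tau_min=1.0, seed=0):
    """Draw n times with density ~ tau^-(1+alpha) for tau >= tau_min."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids a zero base.
    return [tau_min * (1.0 - rng.random()) ** (-1.0 / alpha)
            for _ in range(n)]


def hill_estimator(samples, k):
    """Hill estimate of the tail exponent alpha from the k largest samples."""
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]          # the (k+1)-th largest value
    return k / sum(math.log(x / threshold) for x in ordered[:k])


if __name__ == "__main__":
    times = sample_power_law(alpha=1.5, n=20000, seed=3)
    print(hill_estimator(times, k=5000))
    # With alpha < 2 the second moment is infinite, so the sample estimate
    # of <tau^2>/<tau>^2 does not settle as more samples are added.
    for n in (1000, 20000):
        sub = times[:n]
        mean = sum(sub) / n
        print(n, sum(t * t for t in sub) / n / mean ** 2)
```

For real data the tail only follows a power law beyond some crossover time, so the estimate should be checked for stability as `k` is varied.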



From the results above, we find that for a fixed energy
landscape there exists a dynamic transition
temperature $T_0$. When the temperature is above $T_0$, the FPT
distribution is Poissonian, indicating exponential
kinetics, and in random-walk language we have normal
diffusion on the energy landscape. Below $T_0$, the variance
and higher moments diverge, and the FPT distribution
shows a power-law decay, a sign of
anomalous diffusion. This indicates that the process is
non-self-averaging and that distinct folding pathways emerge at
various time scales. As a comparison, we have calculated
the thermodynamic folding-transition temperature $T_f$ by
identifying the maximum of the heat capacity. We find that $T_f$
is less than but close to $T_0$ for various settings of $\delta\epsilon/\Delta\epsilon$.
This indicates that the thermodynamic and dynamic
behavior of proteins are strongly correlated. However,
recent simulation and experimental results [11], [12]
show that $T_f > T_0$ and that folding is faster below $T_f$
but above $T_0$. This indicates a possible
limitation of the present analytical study. Furthermore,
the crossover behavior near $T_0$ has been shown to
be smoother than what we observe here. This is
probably due in part to the insufficient
cooperative interactions in our study [11]. We will
address these issues in detail in a future publication
[13].



In single-molecule folding experiments, it is now possi- 
ble to measure not only the mean but also the fluctuation 
and moments as well as the distribution of folding time 
[p~5||. Under different experimental and sequence
conditions, one can see different behavior of the folding time
and its distribution. A well-designed, fast-folding
sequence under suitable experimental conditions exhibits
self-averaging and simple rate behavior. Multiple routes run in
parallel and lead to folding. A less well-designed sequence
(with larger $\Delta\epsilon$) folds slowly and often exhibits
non-self-averaging, non-exponential rate behavior, indicating the
existence of intermediate states or local traps. In this
case, the folding process is sensitive to which kinetic
path it takes, since a slight change in a folding pathway
may cause a large fluctuation in the folding time, which
indicates intermittency. One can use single-molecule
experiments to unravel the fundamental mechanisms and
intrinsic features of the folding process. In typical
bulk experiments, it is very hard to observe
and analyze the intermittency, because the dynamics is
averaged over an ensemble of molecules; furthermore,
one cannot tell whether the bulk results arise
from the intrinsic features of individual molecules or from
inhomogeneous averaging over the molecules.

It is worth mentioning that although we focus on the
protein-folding problem in this paper, the
approach we use here is very general for treating problems
with barrier crossings on a multi-dimensional complex
energy landscape. The main ingredient of this model
is Brownian motion on a rough multi-dimensional
landscape, or equivalently, a random walk on a complex
network with a frustration-inducing environment. Since this
picture is quite general, we expect that our results may
also account for a large class of phenomena.
In fact, experiments on glasses, spin glasses, viscous
liquids and conformational dynamics already show the
existence of non-exponential distributions at low
temperature. In particular, a recent experiment on
single-molecule enzymatic dynamics [16] shows explicitly the
Lévy-like distribution of the relaxation time for the
underlying complex protein energy landscape. An
interesting study of anomalous diffusion and non-exponential
dynamics has been made recently using a fractional
Fokker-Planck equation (FFPE) [17] to describe dynamic
processes characterized by Lévy distributions. Our results
show that the approach we use here can serve as a
microscopic basis for the use of such an FFPE.